Kubernetes Cluster Setup and Bootstrapping
Published: 04/20/2026
While I’ve worked with Kubernetes for some time now, I wanted to do a deep dive to learn some of the more nuanced concepts, so I decided to set up a cluster the hard way. I looked at Kelsey Hightower’s “Kubernetes the Hard Way” tutorial, and it’s fantastic, but I wanted to go a little deeper. It was a little too easy to follow the steps and bash snippets he lays out and spin up a whole cluster in a few hours. So here I’ll go over how I set up a cluster the hard, hard way: no easy-to-follow steps, no repo with all the binaries and configs you need in one place, and definitely no kubeadm.
Linode has been my go-to cloud service provider for personal projects, and I haven’t had any complaints, so I stuck with it for this project. Since setting up a Kubernetes cluster for the tiny workload (this blog) that I run on it is already way overkill, I tried to keep things as simple as possible. At minimum I needed four VMs to run everything: one for an external load balancer, one for the control plane node, and two for the worker nodes. I could have run just one worker node, but I thought I might miss out on learning some things related to multi-node dynamics (this turned out to be very true). To bring everything up I used OpenTofu. An infrastructure-as-code setup worked well here because there were many times when I got some configuration wrong, and it was helpful to be able to just destroy everything and bring it back up again.
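As a rough sketch, one of the instances looked something along these lines (the labels, region, size, and resource names here are illustrative, not my exact configs, and `firewall_id` on the instance requires a reasonably recent Linode provider version):

```hcl
# Hypothetical sketch of one node's instance; values are illustrative.
resource "linode_instance" "control_plane" {
  label           = "control-plane-0"
  region          = "us-east"
  type            = "g6-standard-2"
  image           = "linode/debian12"
  private_ip      = true
  authorized_keys = [var.ssh_public_key]

  # Attaching the firewall directly on the instance
  firewall_id = linode_firewall.cluster.id
}
```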
A few notes on things I did in the .tf configs:

- To attach the firewall, I set firewall_id on the instance itself, and that worked.

After the basic infrastructure was set up, I focused on generating the TLS certificates I would need. Each component that communicates with the kube-apiserver needs a TLS certificate, which gets baked into its kubeconfig file. So the following components need a TLS certificate:
```
admin                     # used to create my admin kubeconfig
etcd
etcd-peer
kube-apiserver
kube-api-etcd             # for communicating with etcd
kube-controller-manager
kube-proxy
kube-scheduler
master-node-0-server
master-node-0-client
otelcol                   # for the OpenTelemetry Collector sidecar
service-accounts          # a key pair, not a TLS cert; used by the Token Controller for creating service account tokens
worker-node-0-server
worker-node-1-server
```
Notice that I’m only creating server certificates, not client certificates, for the worker nodes. That’s because I’m relying on the Certificate Signing Request API to create client certificates for those nodes for me. The kubelets on the worker nodes are bootstrapped with a special bootstrap kubeconfig file that points at a bootstrap token secret I created. That lets a kubelet start up and authenticate with just the minimal permissions needed to create a CertificateSigningRequest. The csrapproving controller, which is part of the controller manager, is responsible for approving CSRs. To configure it to automatically approve CSRs from bootstrapping nodes, I defined a ClusterRoleBinding that attaches the system:certificates.k8s.io:certificatesigningrequests:nodeclient ClusterRole to the system:bootstrappers group. Once the CSR is approved, the kubelet is able to authenticate with its new client certificate and be fully authorized as normal.
Similarly, I bootstrapped kube-proxy to run with a service account and use the CSR API.
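For reference, the auto-approval binding can be written as a manifest along these lines (the metadata name is my own choice; the ClusterRole and group shown are the ones the upstream TLS bootstrapping docs use for initial node client certificates):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-csrs-for-bootstrappers   # name is arbitrary
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:bootstrappers
```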
I used openssl to generate the certs. For each of the components above I created an openssl conf file. For example, here’s what the kube-proxy one looked like:
```
[req]
distinguished_name = kube-proxy_distinguished_name
prompt = no
req_extensions = kube-proxy_req_extensions

[kube-proxy_distinguished_name]
CN = system:kube-proxy
O = system:node-proxier

[kube-proxy_req_extensions]
basicConstraints = CA:FALSE
extendedKeyUsage = clientAuth
keyUsage = critical, digitalSignature, keyEncipherment
subjectKeyIdentifier = hash
```
Note that the kube-apiserver and etcd confs also need an [alt_names] section because those components act as servers. For the kube-apiserver, two important IPs go in the SAN list: 10.96.0.1, the first IP in the cluster service CIDR range, and the internal VLAN IP of the host machine.
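As an illustration, the kube-apiserver conf’s extensions section would reference a SAN list shaped like this (the VLAN IP below is a placeholder, not my actual address):

```
subjectAltName = @alt_names

[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
IP.1  = 10.96.0.1       # first IP in the service CIDR
IP.2  = 192.168.128.10  # placeholder for the host's internal VLAN IP
```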
Also, I’m using the Node Authorizer to authorize kubelets to the kube-apiserver, and for that to work, each kubelet’s certificate needs the following subject:

```
CN = system:node:<node-name>
O = system:nodes
```
Similarly, other components need specific CN and O values that identify them and map them to specific RBAC rules. For example, for kube-proxy:

```
CN = system:kube-proxy
O = system:node-proxier
```
or for the kube-apiserver (the client cert it uses to authenticate to the kubelet):

```
CN = kubernetes
O = system:masters
```
Specifying system:masters makes the kube-apiserver part of that group. That group membership ties the kube-apiserver to the cluster-admin ClusterRole, which lets it be authorized by the kubelet when it wants to perform actions on, or retrieve data from, the kubelet.
With all of the above in place, I then made a script to loop over all of the configs and generate a private key, a certificate signing request, and a certificate for each. I wrote another script to export the certificates to where they need to be on the hosts.
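That generation loop can be sketched roughly like this (the directory layout and CA filenames are my assumptions here, not the exact script; it presumes per-component conf files like the kube-proxy one above):

```shell
#!/bin/sh
# Sketch of the cert-generation loop. Assumes per-component openssl conf
# files and an existing CA key pair (ca.key, ca.crt).
set -eu

# gen_certs CONF_DIR CA_DIR OUT_DIR
gen_certs() {
  conf_dir=$1; ca_dir=$2; out_dir=$3
  mkdir -p "$out_dir"
  for conf in "$conf_dir"/*.conf; do
    [ -e "$conf" ] || continue        # no conf files: nothing to do
    name=$(basename "$conf" .conf)
    # Private key and CSR in one step, driven by the component's conf file
    openssl req -new -newkey rsa:2048 -nodes \
      -keyout "$out_dir/$name.key" -out "$out_dir/$name.csr" -config "$conf"
    # Sign with the CA, carrying the requested extensions into the cert
    openssl x509 -req -days 365 -in "$out_dir/$name.csr" \
      -CA "$ca_dir/ca.crt" -CAkey "$ca_dir/ca.key" -CAcreateserial \
      -extensions "${name}_req_extensions" -extfile "$conf" \
      -out "$out_dir/$name.crt"
  done
}

gen_certs conf ca out
```

The `-extensions`/`-extfile` pair is what copies each conf’s requested key usages and SANs into the signed certificate rather than just into the CSR.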
The next step was to set up the container runtime on my control-plane node. The two main options seemed to be CRI-O or containerd. I decided to try CRI-O, because one of the goals of this project was to keep everything as minimal as possible, and CRI-O seemed to fit the bill. Looking back, I’m not sure it was worth sacrificing containerd’s broad support and community for a leaner runtime. Resource-wise, the two runtimes seem to perform fairly similarly, so I don’t know how much benefit I’m getting. That’s not to say I’m unhappy with CRI-O, though; no complaints so far.
I also needed a low-level container runtime, and here the choice was between the common runc and the less popular but more minimal and efficient crun. I chose crun, and it’s been working great so far.
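Pointing CRI-O at crun only takes a small config fragment along these lines (the drop-in path and binary location are assumptions; check the crio.conf documentation for your version):

```toml
# /etc/crio/crio.conf.d/10-crun.conf (path is an assumption)
[crio.runtime]
default_runtime = "crun"

[crio.runtime.runtimes.crun]
runtime_path = "/usr/local/bin/crun"
runtime_type = "oci"
```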
Finally I was ready to launch the kubelet. I created a script to handle all the setup on each host machine. It installs the container runtime binaries, puts the necessary certificates and configs in place, and does some other minor things like installing packages, turning off swap, and enabling some Linux kernel modules. The kernel modules I needed to turn on were br_netfilter and nf_conntrack.
I’ll expand a bit on the network kernel modules. First, a network bridge is like a network switch, allowing various network interfaces to connect. You can create a bridge on your machine with ip link add, and then attach different network interfaces to it. Traffic moving through a bridge operates at the data link layer (OSI layer 2) and, by default, doesn’t pass through the kernel’s netfilter framework, which handles packet filtering at the network and transport layers (OSI layers 3 and 4). This means iptables/nftables rules would not affect those packets. Turning on the br_netfilter module changes that and sends bridge traffic up to netfilter to be processed by iptables/nftables rules. The nf_conntrack module enables connection tracking: it tracks the state of network connections (NEW, ESTABLISHED, INVALID, etc.) and maintains an in-memory table of connection flows. This is what makes Network Address Translation work, because the kernel can track which external IP:port maps to which internal IP:port. For Kubernetes this is important because kube-proxy relies on it to keep track of how connections flow from a ClusterIP to a pod IP.
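The module-and-sysctl step of a startup script like mine can be sketched as follows (the target directory is a parameter so the function can be exercised outside /etc; the sysctl names are the real ones br_netfilter exposes):

```shell
#!/bin/sh
# Sketch: load the two kernel modules now and persist them across reboots.
setup_kernel_modules() {
  etc_dir=${1:-/etc}
  # Load the modules now; ignore failure (e.g. already built into the kernel)
  modprobe br_netfilter 2>/dev/null || true
  modprobe nf_conntrack 2>/dev/null || true
  # Persist across reboots via systemd-modules-load
  mkdir -p "$etc_dir/modules-load.d"
  printf 'br_netfilter\nnf_conntrack\n' > "$etc_dir/modules-load.d/kubernetes.conf"
  # br_netfilter only takes effect if these sysctls route bridged traffic
  # through netfilter
  mkdir -p "$etc_dir/sysctl.d"
  cat > "$etc_dir/sysctl.d/99-kubernetes.conf" <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
}
```

On a real host you would call setup_kernel_modules /etc and then run something like sysctl --system to apply the settings immediately.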
With my startup script, I was able to configure everything and finally start up the kubelet on my master node. I worked on only the master node to begin with, getting the kubelet running there and then launching control plane components. Later, I used my startup script again to bring up the kubelets on the worker nodes.
The actual startup was pretty straightforward. I’m running the kubelet as a systemd service, so I just needed to enable the service. This was also handled by my startup script.
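For completeness, a kubelet unit along these lines would do it (the paths and flags here are representative assumptions, not my exact files):

```ini
# /etc/systemd/system/kubelet.service (paths are illustrative)
[Unit]
Description=kubelet
After=crio.service
Requires=crio.service

[Service]
ExecStart=/usr/local/bin/kubelet \
    --config=/var/lib/kubelet/kubelet-config.yaml \
    --kubeconfig=/var/lib/kubelet/kubeconfig \
    --container-runtime-endpoint=unix:///var/run/crio/crio.sock
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

After dropping the unit in place, systemctl daemon-reload followed by systemctl enable --now kubelet starts it and keeps it starting on boot.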
In part II of this post I’ll go into more detail on how I deployed the control plane components, then set up the worker nodes and deployed workloads on them.